Methods For Running Priority-Based Application Threads On A Realtime Component

ABSTRACT

Systems and methods for processing priority-based application threads on a realtime component are described. A mixing component submits blank buffers to the realtime component. The mixing component receives application thread data according to a priority-based schedule and writes the data using a second real-time thread to the buffers before the buffers into which the data is written are processed. Buffers are created on memory page boundaries with an offset into the memory page such that the least significant bits of a virtual memory address referencing the memory page can be used as an index into a circular buffer queue to determine which buffer is currently being processed. When writing into a buffer, a buffer that is a predetermined range of buffers behind the buffer currently being processed is used.

RELATED APPLICATION

This is a divisional application of and claims priority to U.S. patent application Ser. No. 09/960,873, the disclosure of which is incorporated by reference herein.

TECHNICAL FIELD

The present invention relates to running priority-based application threads on a realtime component. More particularly, the present invention relates to running priority-based audio component threads on a sound component based on a realtime schedule.

BACKGROUND

Most operating systems, including the WINDOWS family of operating systems by MICROSOFT, are priority-based models, meaning that the determination of which processes or threads get executed first is based on a level of importance, or priority, assigned to the processes or threads. For example, a thread dealing with spreadsheet calculation may be assigned a lower priority than a driver interrupt, because it is more important to process a driver interrupt (in the event, say, that a modem just received sixteen more characters) than it is to process the spreadsheet calculation. Part of the reason for the priority in this case is that other processes may depend on receiving the data from the modem.

One of the problems encountered with a priority-based system is that some processes may be pre-empted by a process having a higher priority. While this is not a problem in some cases, there are times when a pre-empted process may be pre-empted for longer than an acceptable amount of time. Streaming audio and video processes, for example, cannot be pre-empted for too long without experiencing unacceptable delays that result in glitches. Furthermore, if several applications are running that have threads that are assigned a high priority, the streaming media may be held off too long to execute properly.

Multimedia applications in particular have a goal of providing glitch-free operation so a user experiences a technically smooth operation. However, to obtain dependable glitch-free operation in a priority-based system, data must be submitted in large segments. Providing data in large segments causes a problem with high latency periods (the time from rendering the segment, or buffer, until the time the segment is played). High latency is undesirable for multimedia applications.

The high latency problem may be solved by submitting multimedia data in smaller segments. However, when this is done, the multimedia application is subject to interruption by higher priority threads. In a worst case scenario, several higher priority threads can be processed while the multimedia data is held off indefinitely.

A solution that many developers are turning to is to implement a realtime scheduling system, where each thread to be executed is allocated a certain percentage of guaranteed processor time for execution. Use of a realtime scheduler guarantees that a thread will not be delayed indefinitely, or for so long that its execution becomes unacceptable.

However, use of a realtime scheduling system presents another problem: even if components within an operating system are realtime-based, there are many applications, drivers, hardware, etc., that schedule threads on a priority basis. Therefore, the result is a component that schedules threads on a priority basis feeding a system component that schedules threads on a realtime basis. If threads are simply passed through to a realtime scheduler as they are received on a priority basis, the same limitations experienced with priority-based systems will still be seen.

There is a need for a way to run priority-based components, or threads, on a realtime scheduler so that the priority-based component can experience the advantages of a realtime system.

SUMMARY

Systems and methods are described for running a priority-based component on a realtime scheduler. An operating system module (e.g., a WINDOWS kernel mode module) receives data from a user mode component that schedules threads on a priority basis. The systems and methods described herein may be described as a bridge between components running in normal threads and components running in realtime mode. As such, they apply to any system with boundaries between realtime and non-realtime components.

An audio mixer module provides an interface between a sound card/driver and one or more priority-based applications. The audio mixer receives data from the applications that the applications submit to the sound card for playing. A priority-based operating system allows the applications to send the data as priority allows. The audio mixer then sends the data received from the applications to the sound card, via the driver, according to a realtime scheduler.

To accommodate both the priority-based system and the real-time system, the audio mixer is configured to submit a series of empty buffers. The empty buffers are submitted to the driver to be played by the sound card. The sound card plays buffers in a circular queue, so empty buffers that are submitted must wait some time before they are played. This reserves space in the realtime system for data from the applications to be played. Before the sound card attempts to play an empty buffer, the audio mixer fills the buffer with data received from the applications. This allows data received from the applications according to a priority to be played according to a realtime schedule.

In one implementation of the invention described herein, the audio mixer performs a customized mix loop that allows the audio mixer to determine when an empty buffer is just ahead of the sound card and mix into that buffer to provide a playable buffer that will be played when the sound card is ready. The mix loop may be individually tailored to mix a certain number of buffers before the buffers are played by the sound card and to synchronize the mixing with the playing by the sound card. If the buffers are mixed at a faster or slower rate than the sound card is playing the buffers (because the sound card and the audio mixer are running off different clocks), then the mix loop compensates so that the mixing and the playing remain in synchronization.

Since the implementations described herein require that the mixer know exactly what data (i.e., which buffer) is being played at any given time, a tracking mechanism is described that allows the mixer to be informed of an exact memory location that is being played. To accomplish this, the buffers are allocated to begin on a page boundary in memory. Each buffer is offset into the memory page by a multiple of the buffer's position in the queue. For example, buffer zero is not offset into the page; buffer one is offset sixteen bytes into the page; buffer two is offset thirty-two bytes into the page; and so on. This allocation allows the tracking mechanism to use the bottom eight (or more) bits in a virtual memory address (which indicate an offset into a memory page) as an index number of a buffer within the circular queue.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of an exemplary computer system on which at least one embodiment of the invention may be implemented, indicating data flow between components.

FIG. 2 is a flow diagram depicting a methodological implementation for running a non-realtime thread in an audio mixer.

FIG. 3 is a flow diagram depicting a methodological implementation for running a realtime mix thread in an audio mixer.

FIG. 4 is a diagram of an exemplary computer system that can be used to implement various aspects of various implementations of the invention.

DETAILED DESCRIPTION

Exemplary System

FIG. 1 is a block diagram of a basic computer system 100 showing components involved in the present invention and showing data flow between the components. The computer system 100 includes a processor 102, an input/output (I/O) device 104 configured to receive and transmit video and/or audio data, a display 106, a sound card 108 for playing audio data, and memory 110. The memory 110 stores an audio application 112, which is a non-realtime audio client playing audio data, and a sound card driver 113 that is configured to drive the sound card 108.

The audio application 112 sends data to an audio mixer 116, which is configured to receive audio data from various sources and mix the audio data to provide a single output to the sound card 108 via the sound card driver 113. The audio mixer includes a buffer submission module 118, a mix loop module 120 and a tracking module 122. The functions of the audio mixer modules will be discussed in greater detail below.

The audio mixer 116 transmits the audio data to the sound card driver 113 in a series of buffers 124. There may be virtually any number of buffers 124 designed to hold virtually any amount of audio data; however, in the described implementation, there are sixteen buffers (buffer[0] to buffer[15]), each buffer 124 being configured to store ten milliseconds of audio data.
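The sizing implied by this configuration can be illustrated with a short calculation. The following is only a sketch: the source gives the buffer count and the ten-millisecond duration, but the PCM format below (48 kHz, 16-bit, stereo) and the function names are assumptions chosen for illustration.

```python
# Values taken from the described implementation.
NUM_BUFFERS = 16
BUFFER_MS = 10

# Assumed PCM format; the source does not specify one.
SAMPLE_RATE_HZ = 48_000
BYTES_PER_SAMPLE = 2
CHANNELS = 2


def buffer_size_bytes(sample_rate_hz=SAMPLE_RATE_HZ,
                      bytes_per_sample=BYTES_PER_SAMPLE,
                      channels=CHANNELS,
                      buffer_ms=BUFFER_MS):
    """Bytes needed to hold buffer_ms milliseconds of interleaved PCM audio."""
    frames = sample_rate_hz * buffer_ms // 1000
    return frames * bytes_per_sample * channels


def queue_duration_ms(num_buffers=NUM_BUFFERS, buffer_ms=BUFFER_MS):
    """Total playing time represented by one full pass around the queue."""
    return num_buffers * buffer_ms
```

Under the assumed format, each buffer holds 480 frames, and a full pass around the sixteen-buffer queue represents 160 milliseconds of audio.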

The audio mixer 116 is configured to receive play position data 126 from the sound card driver 113. The play position data 126 is used by the mix loop module 120 and the tracking module 122 to determine an exact memory position being played by the sound card 108. The exact memory position that is being played is used to determine which buffer is being played and to gauge the synchronization between the timing of the sound card 108 and the timing of the audio mixer 116.

The components and functions of FIG. 1 will be discussed in greater detail with respect to FIG. 2 and FIG. 3. In the following discussion, continuing reference will be made to the reference numerals shown in FIG. 1.

Exemplary Buffer Submission Module

FIG. 2 is a flow diagram depicting a methodological implementation for running a non-realtime thread (buffer submission module 118) in the audio mixer 116. The non-realtime thread creates blank, i.e., empty, buffers and sends them to the sound card driver 113. When a buffer has been played, it is cleared and sent again. This cycle keeps a constant feed of empty buffers queued at the sound card driver 113 for submission to the sound card 108.

At block 200, a buffer is initialized by setting all bits in the buffer to zero. The blank buffer is sent from the audio mixer 116 to the sound card driver 113 at block 202. As long as the submitted buffer has not been played (“No” branch, block 204), another buffer is initialized (block 200) and submitted to the sound card driver 113 (block 202).

When a buffer has been played (“Yes” branch, block 204), then if there is any more data in the audio stream to be played (“Yes” branch, block 206), buffers are initialized and sent to the sound card driver 113 (blocks 200, 202). The process ends (block 208) when the audio stream has concluded.
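The submission cycle of FIG. 2 can be sketched as a small simulation. This is an illustrative model only, not the described implementation: `FakeDriver`, `submit_blank`, and `run_submission` are hypothetical names, the driver is reduced to a first-in, first-out queue, and the length of the audio stream is expressed as a count of buffers.

```python
from collections import deque


class FakeDriver:
    """Hypothetical stand-in for the sound card driver: plays buffers in FIFO order."""

    def __init__(self):
        self.queue = deque()
        self.played = []          # snapshot of each buffer's contents when played

    def submit(self, buf):
        self.queue.append(buf)

    def play_one(self):
        buf = self.queue.popleft()
        self.played.append(bytes(buf))
        return buf


def submit_blank(driver, buf):
    buf[:] = bytes(len(buf))      # block 200: set every byte to zero
    driver.submit(buf)            # block 202: send to the driver


def run_submission(driver, buffers, buffers_in_stream):
    """Prime the queue with blank buffers, then recycle each played buffer
    for as long as the stream still has data (blocks 204 and 206)."""
    primed = buffers[:buffers_in_stream]
    for buf in primed:
        submit_blank(driver, buf)
    submitted = len(primed)
    while driver.queue:
        done = driver.play_one()              # block 204, "Yes" branch
        if submitted < buffers_in_stream:     # block 206, "Yes" branch
            submit_blank(driver, done)
            submitted += 1
    return submitted                          # block 208: stream concluded
```

In this model nothing fills the buffers, so every buffer is played as silence; in the described system the realtime mix thread writes into each buffer before the sound card reaches it.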

As previously discussed, each buffer may be configured to hold an amount of data that can be played in one pre-determined time period. In addition, there may be a pre-determined number of buffers used so that it is possible to determine which buffers have been played. In the described implementation, there are sixteen buffers (buffer[0] to buffer[15]) that each hold an amount of audio data that can be played in ten milliseconds.

The buffers are sent to the sound card driver 113 to reserve a time slot for the sound card 108 to play the audio data contained in the buffers. At the time the buffers are submitted, the buffers do not contain any audio data. As will be discussed in greater detail below, however, the empty buffers are filled with audio data prior to being played by the sound card 108. The buffers are filled according to a realtime schedule, i.e., by a realtime scheduler.

Exemplary Mix Loop Module

FIG. 3 is a flow diagram depicting a methodological implementation for running a realtime mix loop thread (mix loop module 120) in the audio mixer 116 in accordance with an implementation of the present invention. In the implementation depicted by the flow diagram of FIG. 3, it is noted that the described system is configured to mix at least two buffers ahead of the buffer that the sound card 108 is currently playing, but not more than four buffers ahead. In other words, the mixing takes place two, three, or four buffers ahead of the playing. Given the present example, wherein each buffer contains ten milliseconds of audio data, this gives a latency of from twenty to forty milliseconds.
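The latency figures follow directly from the lead distance and the buffer duration. A minimal sketch of the arithmetic, with constant and function names that are illustrative only:

```python
BUFFER_MS = 10                 # duration of one buffer in the example
MIN_AHEAD, MAX_AHEAD = 2, 4    # mixing lead, in buffers, from the description


def latency_ms(buffers_ahead, buffer_ms=BUFFER_MS):
    """Time between mixing a buffer and the sound card reaching it."""
    return buffers_ahead * buffer_ms


def latency_range_ms():
    """Smallest and largest latency for the permitted mixing lead."""
    return latency_ms(MIN_AHEAD), latency_ms(MAX_AHEAD)
```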

At block 300, the audio mixer 116 obtains a current audio position that indicates—by a hardware index and an offset—the buffer position and offset into the buffer that the sound card 108 is currently playing. If no valid position can be found (“No” branch, block 302), then a pre-roll flag is set to False (block 304). This ensures that the audio mixer index (the buffer being mixed into) will be resynchronized with the sound card index (the buffer being played) at the next available time.

After the pre-roll flag is set at block 304, if a buffer has not been mixed yet (“No” branch, block 306) then the process reverts back to block 300 and a current audio position is obtained. If a buffer has already been mixed (“Yes” branch, block 306), and the mixing (audio mixer 116) is outpacing the playing (“Yes” branch, block 308), then the mixing should yield for some period of time (block 312). Otherwise, soon the audio mixer 116 will have no more buffers to write to and—given sixteen buffers that can hold ten milliseconds of audio data each—one hundred and sixty milliseconds of latency will be experienced. This is probably unacceptable.

If the mixing buffer is less than four buffers ahead of the playing (“No” branch, block 308), then the audio mixer yields for a remainder of the time allotted to the thread (block 310). In the present example, the audio mixer is allotted ten milliseconds of time to mix buffers. If the mixing buffer is more than four buffers ahead of the playing (“Yes” branch, block 308), then the audio mixer yields for the remainder of the time allotted plus the following period (block 312). In this example, the audio mixer 116 would yield for the remainder of the current period plus ten milliseconds. After the yield period has been determined and executed, the process reverts to block 300 where a current audio position is obtained.
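The yield decision at blocks 308 through 312 can be sketched as a single function. The names are hypothetical, and one detail is an assumption: the text does not say what happens when the mixer is exactly four buffers ahead, so this sketch treats that case as not outpacing the sound card.

```python
PERIOD_MS = 10.0   # time allotted to the mix thread per period in the example
MAX_AHEAD = 4      # lead beyond which the mixer is considered to be outpacing


def yield_ms(delta, elapsed_in_period_ms, period_ms=PERIOD_MS):
    """How long the mix thread should yield, given its lead in buffers (delta)
    and how much of the current period it has already used."""
    remainder = max(0.0, period_ms - elapsed_in_period_ms)
    if delta > MAX_AHEAD:
        # Block 312: give up the rest of this period plus the following one.
        return remainder + period_ms
    # Block 310: give up only the rest of this period.
    # Assumption: a lead of exactly four buffers also lands here.
    return remainder
```

For example, a mixer that is five buffers ahead and four milliseconds into its period would yield for sixteen milliseconds, while one that is three buffers ahead would yield for six.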

If a valid position is found (“Yes” branch, block 302), then it is determined at block 314 if the sound card is starting to pre-roll audio by playing silence prior to the first buffer. If so (“Yes” branch, block 314), then the audio mixer 116 and the sound card 108 are resynchronized to point to the same buffer (block 316). The pre-roll flag is then set at block 318. If pre-roll is not occurring (“No” branch, block 314), then no resynchronization is performed.

At block 320, it is determined whether to end any previous state of pre-rolling. If the offset into the playing buffer is non-negative (“Yes” branch, block 320), then the pre-roll flag is cleared (block 332).

At block 322, it is determined whether the audio mixer 116 is about to mix into a buffer that is currently playing. The delta, or difference, is measured as the number of buffers between the currently playing buffer and the next buffer to mix. If this number reaches zero while the pre-roll flag is false (block 322), the audio mixer 116 skips to the next buffer in the queue (block 324). This is not necessary if the mixing buffer is different from the playing buffer (“No” branch, block 322).

If delta is greater than ten (which could indicate that the audio mixer is more than ten buffers ahead or, more likely, six or more buffers behind)(“Yes” branch, block 334), then the audio mixer 116 skips to the next buffer in the queue at block 324 to help catch up with the processing of the sound card 108. Otherwise (“No” branch, block 334), no adjustment is necessary.
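The two skip conditions at blocks 322 and 334 can be sketched as follows. The function names are hypothetical, and the modular formula for delta is an assumption: the text defines delta only as the number of buffers between the playing buffer and the next buffer to mix, and wrapping around a sixteen-entry circular queue is one reading consistent with a large delta indicating a mixer that has fallen behind.

```python
NUM_BUFFERS = 16
BEHIND_THRESHOLD = 10   # "delta is greater than ten" at block 334


def buffer_delta(mix_index, play_index, num_buffers=NUM_BUFFERS):
    """Buffers from the playing buffer forward to the next buffer to mix,
    wrapping around the circular queue (assumed interpretation)."""
    return (mix_index - play_index) % num_buffers


def should_skip(mix_index, play_index, pre_roll):
    """True when the mixer should skip to the next buffer (block 324)."""
    delta = buffer_delta(mix_index, play_index)
    if delta == 0 and not pre_roll:
        return True     # block 322: about to mix into the playing buffer
    if delta > BEHIND_THRESHOLD:
        return True     # block 334: mixer has most likely fallen behind
    return False


def next_index(index, num_buffers=NUM_BUFFERS):
    """Advance an index by one position in the circular queue."""
    return (index + 1) % num_buffers
```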

The audio mixer 116 mixes the appropriate buffer at block 326. The audio mixer index is then incremented to look ahead to the next buffer to be mixed (block 328). A new difference between the buffer being played and the buffer being mixed (delta) is determined at block 330. If the audio mixer 116 is less than three buffers ahead, i.e., delta < 3 (“Yes” branch, block 330), then the process reverts to block 300 and repeats.

If the audio mixer 116 is three or more buffers ahead—i.e., delta >=3—(“No” branch, block 330), then the audio mixer 116 is too far ahead of the sound card 108 and the process reverts to blocks 306 through 312, where the audio mixer 116 yields some time to allow the sound card 108 processing to catch up.
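The mix, increment, and compare steps of blocks 326 through 330 can be sketched as a loop that runs until the mixer is three buffers ahead. This is a simplified model: `mix_burst`, `get_play_index`, and `mix_fn` are hypothetical names, the position validity and pre-roll checks of blocks 302 through 324 are omitted, and delta is assumed to wrap around the sixteen-buffer queue.

```python
NUM_BUFFERS = 16
TARGET_AHEAD = 3   # the loop stops mixing once delta >= 3


def mix_burst(mix_index, get_play_index, mix_fn, num_buffers=NUM_BUFFERS):
    """Mix consecutive buffers until the mixer is TARGET_AHEAD buffers ahead.

    get_play_index() models block 300 (obtain the current play position);
    mix_fn(index) models block 326 (mix into the buffer at that index).
    Returns the next index to mix and the list of indices mixed.
    """
    mixed = []
    for _ in range(num_buffers):          # safety bound for the sketch
        mix_fn(mix_index)                 # block 326
        mixed.append(mix_index)
        mix_index = (mix_index + 1) % num_buffers               # block 328
        delta = (mix_index - get_play_index()) % num_buffers    # block 330
        if delta >= TARGET_AHEAD:
            break                         # far enough ahead: go yield
    return mix_index, mixed
```

With the sound card held at buffer 14 and the mixer starting at the same index, this sketch mixes buffers 14, 15, and 0 and then stops with buffer 1 as the next buffer to mix.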

Exemplary Tracking Module

The tracking module 122 is configured to accurately determine an exact memory position where the sound card 108 is playing. With USB (Universal Serial Bus) Audio, this is provided in terms of a pointer to physical data in the playing buffer and an offset (in milliseconds) from the beginning of the buffer. A problem arises because, although the USB Audio virtual memory address returned points to the same physical memory address as the virtual memory address used by the audio mixer 116, the two virtual memory addresses are different. Therefore, it is impossible to determine which buffer is playing (by audio mixer index) from the USB Audio information.

To solve this problem, the present invention may be implemented so that each buffer begins on a memory page boundary so that the bottom twelve bits of the virtual memory addresses used by the audio mixer 116 and by USB Audio will be identical. This is assuming a thirty-two bit virtual address as used in the WINDOWS family of operating systems by MICROSOFT, wherein the first twenty bits of a virtual address refer to page tables and page directories, and the last twelve bits of the virtual address refer to an offset into the page. To do this, the buffers must be allocated (possibly over-allocated) such that using the bottom bits is not a problem. This requires over-allocating the buffers by up to two times a page size (2 times 4K, or 8K, in WINDOWS).
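The address arithmetic behind this can be sketched with plain integers. The helper names and the example addresses are made up for illustration; only the 4K page size and the twelve-bit page offset come from the text.

```python
PAGE_SIZE = 4096              # 4K pages, as stated for WINDOWS
PAGE_MASK = PAGE_SIZE - 1     # selects the bottom twelve bits of an address


def align_up(addr, alignment=PAGE_SIZE):
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


def page_offset(addr):
    """Bottom twelve bits of an address: the offset into its memory page."""
    return addr & PAGE_MASK


# Two hypothetical virtual addresses that map the same physical location.
# Their upper twenty bits differ, but the offset into the page is shared.
mixer_va = 0x7FFD3020
usb_va = 0x8A41B020
same_offset = page_offset(mixer_va) == page_offset(usb_va)
```

Over-allocating leaves room to round the start of a buffer up to a page boundary with something like `align_up`, which is why the allocation must be larger than the buffer itself.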

When writing to the first buffer (buffer[0]), the data is written into the buffer beginning at byte zero of the memory page. This means that there is no offset into the memory page being used and, therefore, the bottom twelve bits of the virtual address are 0000 0000 0000, or 0x000. When writing the next buffer (buffer[1]), the data is written beginning sixteen (16) bytes into the memory page being used. As a result, the last twelve bits of the virtual address for buffer[1] are 0000 0001 0000, or 0x010. Likewise, when writing the next buffer (buffer[2]), the data is written beginning thirty-two (32) bytes into the memory page being used. As a result, the last twelve bits of the virtual address for buffer[2] are 0000 0010 0000, or 0x020. Data for each successive buffer is stored beginning sixteen (16) bytes further into its memory page than the data for the previous buffer. As a result, each buffer has a virtual memory address ending in the twelve bits shown below:

Buffer[0]=0x000    Buffer[8]=0x080
Buffer[1]=0x010    Buffer[9]=0x090
Buffer[2]=0x020    Buffer[10]=0x0A0
Buffer[3]=0x030    Buffer[11]=0x0B0
Buffer[4]=0x040    Buffer[12]=0x0C0
Buffer[5]=0x050    Buffer[13]=0x0D0
Buffer[6]=0x060    Buffer[14]=0x0E0
Buffer[7]=0x070    Buffer[15]=0x0F0

Therefore, determining which buffer is pointed to by the pointer returned is simply a matter of looking at the last twelve bits of the virtual memory address and mapping the value found to the buffer configuration.

An additional step that may be taken when the buffers are configured in this way is to check that the least significant bit of the virtual memory address is always zero. If the least significant bit of the virtual memory address is not zero, then it means that the virtual memory address is incorrect and an error has occurred somewhere in the processing that determines where the sound card 108 is currently playing.
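The mapping from an address to a buffer index, together with a sanity check, can be sketched as follows. The function name and the example address are hypothetical. The check here is an assumption that goes further than the single-bit test described above: it rejects any address whose page offset is not one of the sixteen expected values.

```python
PAGE_MASK = 0xFFF     # bottom twelve bits: offset into the memory page
STRIDE = 0x10         # each buffer starts 0x010 further into its page
NUM_BUFFERS = 16


def buffer_index_from_address(addr):
    """Map the start address of the playing buffer to its queue index.

    Raises ValueError if the page offset does not match any buffer,
    which would indicate an error in the reported play position.
    """
    offset = addr & PAGE_MASK
    index, remainder = divmod(offset, STRIDE)
    if remainder != 0 or index >= NUM_BUFFERS:
        raise ValueError(f"unexpected page offset 0x{offset:03X}")
    return index
```

For instance, an address ending in 0x0A0 maps to buffer[10] regardless of its upper twenty bits, so the lookup works for both the audio mixer's address and the one returned by USB Audio.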

Exemplary Computer System

FIG. 4 shows an exemplary computer system that can be used to implement various computing devices, e.g., client computers, servers and the like, in accordance with the described implementations and embodiments.

Computer 430 includes one or more processors or processing units 432, a system memory 434, and a bus 436 that couples various system components including the system memory 434 to processors 432. The bus 436 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. The system memory 434 includes read only memory (ROM) 438 and random access memory (RAM) 440. A basic input/output system (BIOS) 442, containing the basic routines that help to transfer information between elements within computer 430, such as during start-up, is stored in ROM 438.

Computer 430 further includes a hard disk drive 444 for reading from and writing to a hard disk (not shown), a magnetic disk drive 446 for reading from and writing to a removable magnetic disk 448, and an optical disk drive 450 for reading from or writing to a removable optical disk 452 such as a CD ROM or other optical media. The hard disk drive 444, magnetic disk drive 446, and optical disk drive 450 are connected to the bus 436 by an SCSI interface 454 or some other appropriate interface. The drives and their associated computer-readable media provide nonvolatile storage of computer-readable instructions, data structures, program modules and other data for computer 430. Although the exemplary environment described herein employs a hard disk, a removable magnetic disk 448 and a removable optical disk 452, it should be appreciated by those skilled in the art that other types of computer-readable media which can store data that is accessible by a computer, such as magnetic cassettes, flash memory cards, digital video disks, random access memories (RAMs), read only memories (ROMs), and the like, may also be used in the exemplary operating environment.

A number of program modules may be stored on the hard disk 444, magnetic disk 448, optical disk 452, ROM 438, or RAM 440, including an operating system 458, one or more application programs 460, other program modules 462, and program data 464. A user may enter commands and information into computer 430 through input devices such as a keyboard 466 and a pointing device 468. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are connected to the processing unit 432 through an interface 470 that is coupled to the bus 436. A monitor 472 or other type of display device is also connected to the bus 436 via an interface, such as a video adapter 474. In addition to the monitor, personal computers typically include other peripheral output devices (not shown) such as speakers and printers.

Computer 430 commonly operates in a networked environment using logical connections to one or more remote computers, such as a remote computer 476. The remote computer 476 may be another personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to computer 430, although only a memory storage device 478 has been illustrated in FIG. 4. The logical connections depicted in FIG. 4 include a local area network (LAN) 480 and a wide area network (WAN) 482. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the Internet.

When used in a LAN networking environment, computer 430 is connected to the local network 480 through a network interface or adapter 484. When used in a WAN networking environment, computer 430 typically includes a modem 486 or other means for establishing communications over the wide area network 482, such as the Internet. The modem 486, which may be internal or external, is connected to the bus 436 via a serial port interface 456. In a networked environment, program modules depicted relative to the personal computer 430, or portions thereof, may be stored in the remote memory storage device. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.

Generally, the data processors of computer 430 are programmed by means of instructions stored at different times in the various computer-readable storage media of the computer. Programs and operating systems are typically distributed, for example, on floppy disks or CD-ROMs. From there, they are installed or loaded into the secondary memory of a computer. At execution, they are loaded at least partially into the computer's primary electronic memory. The invention described herein includes these and other various types of computer-readable storage media when such media contain instructions or programs for implementing the steps described below in conjunction with a microprocessor or other data processor. The invention also includes the computer itself when programmed according to the methods and techniques described below.

For purposes of illustration, programs and other executable program components such as the operating system are illustrated herein as discrete blocks, although it is recognized that such programs and components reside at various times in different storage components of the computer, and are executed by the data processor(s) of the computer.

CONCLUSION

The above-described methods and systems provide a bridge between priority-based scheduling applications and a realtime scheduler. Existing applications utilizing a priority-based system can thereby be run on systems utilizing a realtime scheduler. The methods and systems also provide a way in which systems using different virtual memory addresses to address the same physical address can be manipulated to indicate a certain buffer that is being executed.

Although the invention has been described in language specific to structural features and/or methodological steps, it is to be understood that the invention defined in the appended claims is not necessarily limited to the specific features or steps described. Rather, the specific features and steps are disclosed as preferred forms of implementing the claimed invention. 

1. A method for processing data from one or more priority-scheduled components with a realtime component, comprising: creating at least a first buffer and a second buffer; each buffer being empty; submitting the empty buffers to hardware in a buffer queue, the first buffer preceding the second buffer; receiving data from the priority-scheduled component to be processed by the realtime component; processing the received data in the realtime component; and writing realtime component output data to the first buffer before the first buffer is processed by the hardware, so that when the hardware processes the first buffer, the first buffer is not empty.
 2. The method as recited in claim 1, further comprising: clearing a buffer after the buffer has been processed by the hardware; and resubmitting the cleared buffer in the buffer queue, thereby creating a circular buffer queue.
 3. The method as recited in claim 1, further comprising creating each buffer so that each buffer can be identified by a unique offset from a memory page boundary.
 4. The method as recited in claim 3, wherein the beginning of each buffer is offset a number of bytes into a memory page from the memory page boundary, so that the low order bits of the address of the start of a buffer uniquely identify the buffer.
 5. The method as recited in claim 4, wherein the number of bytes comprising the offset for a buffer is a function of the buffer's position in the buffer queue.
 6. The method as recited in claim 4, wherein the number of bytes comprising the offset is a power of 2.
 7. The method as recited in claim 1, wherein the writing the realtime component output data further comprises: tracking which buffer is currently being processed by the hardware; and writing the data to a buffer other than the buffer that is currently being processed by the hardware.
 8. The method as recited in claim 7, wherein: the creating at least a first buffer and a second buffer further comprises creating multiple buffers so that each successive buffer is offset from a memory page boundary by a fixed number of bytes more than the previously created buffer; and the tracking further comprises identifying the buffer currently being processed by the hardware from the least significant bits of a virtual or physical address of the start of that buffer.
 9. The method as recited in claim 8, wherein: the fixed number of bytes further comprises a power of 2; and the number of least significant bits of a memory address used to identify the buffer currently being processed by the hardware is a function of a size of the memory page in bytes. 